An Imperfect Dopaminergic Error Signal Can Drive Temporal-Difference Learning
Authors
Abstract
An open problem in the field of computational neuroscience is how to link synaptic plasticity to system-level learning. A promising framework in this context is temporal-difference (TD) learning. Experimental evidence that supports the hypothesis that the mammalian brain performs temporal-difference learning includes the resemblance of the phasic activity of the midbrain dopaminergic neurons to...
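The abstract frames the phasic dopamine response as a temporal-difference reward-prediction error. As a minimal illustration (a hypothetical toy state chain, not the paper's model), a tabular TD(0) update shows how that error signal alone drives value learning:

```python
# Minimal tabular TD(0) sketch on a hypothetical 5-state chain (an
# illustrative toy, not the paper's model). The TD error `delta` plays
# the role of the phasic dopaminergic reward-prediction-error signal.

N_STATES = 5     # states 0..4; entering terminal state 4 yields reward 1.0
ALPHA = 0.1      # learning rate
GAMMA = 1.0      # undiscounted toy chain

def run_td0(episodes=2000):
    V = [0.0] * N_STATES            # value estimates; terminal state stays 0
    for _ in range(episodes):
        for s in range(N_STATES - 1):
            s_next = s + 1
            r = 1.0 if s_next == N_STATES - 1 else 0.0
            delta = r + GAMMA * V[s_next] - V[s]   # reward-prediction error
            V[s] += ALPHA * delta                  # plasticity driven by delta
    return V
```

Early in training `delta` is large when the reward arrives; as the values converge it shrinks there and effectively migrates to the earliest predictive state, mirroring the shift in phasic dopaminergic firing the abstract alludes to.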
Similar Articles

A Meta-learning Method Based on Temporal Difference Error
In general, meta-parameters in a reinforcement learning system, such as the learning rate and the discount rate, are determined empirically and fixed during learning. Therefore, when the external environment changes, the system cannot adapt itself to the variation. Meanwhile, it is suggested that the biological brain might conduct reinforcement learning and adapt itself to the external environment ...
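One way to make this idea concrete is to treat the learning rate itself as a meta-parameter driven by the TD error. The rule below is a hypothetical sketch, not the paper's method; the function name, rates, and bounds are illustrative assumptions:

```python
# Hypothetical meta-learning rule in the spirit of the abstract: the
# learning rate alpha is raised when a running average of the absolute
# TD error grows (suggesting the environment has changed) and decays
# again as predictions become accurate.

def adapt_alpha(alpha, delta, avg_abs_delta,
                meta_rate=0.01, smoothing=0.1,
                alpha_min=0.01, alpha_max=0.5):
    """Return an updated (alpha, avg_abs_delta) pair."""
    # Exponential moving average of |delta| tracks persistent surprise.
    avg_abs_delta = (1 - smoothing) * avg_abs_delta + smoothing * abs(delta)
    # Pull alpha toward the current surprise level, then clip to bounds.
    alpha += meta_rate * (avg_abs_delta - alpha)
    return min(max(alpha, alpha_min), alpha_max), avg_abs_delta
```

Under this rule, a sudden change in the environment produces sustained TD errors that transiently raise alpha, letting the values re-adapt quickly, after which alpha relaxes back toward its floor.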
Analytical Mean Squared Error Curves in Temporal Difference Learning
Peter Dayan, Brain and Cognitive Sciences, E25-210, MIT, Cambridge, MA 02139, [email protected]

We have calculated analytical expressions for how the bias and variance of the estimators provided by various temporal difference value estimation algorithms change with offline updates over trials in absorbing Markov chains using lookup table representations. We illustrate classes of learning curve...
An Introduction to Temporal Difference Learning
Temporal Difference learning is one of the most widely used approaches for policy evaluation and a central part of solving reinforcement learning tasks. To derive optimal control, policies have to be evaluated, a task that requires value function approximation; this is where TD methods find application. The use of eligibility traces for backpropagation of updates as well as the bootstrapping of th...
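The eligibility traces and bootstrapping mentioned above can be sketched with tabular TD(λ) for policy evaluation on a small deterministic chain; the chain, rewards, and parameter values are illustrative assumptions:

```python
# Compact TD(lambda) policy-evaluation sketch with accumulating
# eligibility traces on a hypothetical 4-state chain (state 3 terminal,
# reward 1.0 on entering it). The TD error uses a bootstrapped target,
# and the traces propagate each update back to earlier visited states.

def td_lambda(episodes=500, alpha=0.1, gamma=0.9, lam=0.8):
    n = 4
    V = [0.0] * n
    for _ in range(episodes):
        e = [0.0] * n              # eligibility traces, reset per episode
        s = 0
        while s != 3:
            s_next = s + 1
            r = 1.0 if s_next == 3 else 0.0
            delta = r + gamma * V[s_next] - V[s]   # bootstrapped TD error
            e[s] += 1.0                            # accumulating trace
            for i in range(n):
                V[i] += alpha * delta * e[i]       # credit visited states
                e[i] *= gamma * lam                # decay traces
            s = s_next
    return V
```

With γ = 0.9 the true values of this chain are 0.81, 0.9, and 1.0 for states 0–2; larger λ spreads each update further back along the trajectory, trading bias for variance.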
Dual Temporal Difference Learning
Recently, researchers have investigated novel dual representations as a basis for dynamic programming and reinforcement learning algorithms. Although the convergence properties of classical dynamic programming algorithms have been established for dual representations, temporal difference learning algorithms have not yet been analyzed. In this paper, we study the convergence properties of tempor...
Journal: PLoS Computational Biology
Year: 2011
ISSN: 1553-7358
DOI: 10.1371/journal.pcbi.1001133